Search Result

Text multi-label classification method incorporating BERT and label semantic attention

Xueqiang LYU, Chen PENG, Le ZHANG, Zhi’an DONG, Xindong YOU

Journal of Computer Applications 2022, 42 (1): 57-63. DOI: 10.11772/j.issn.1001-9081.2021020366

Multi-Label Text Classification (MLTC) is an important subtask in Natural Language Processing (NLP). To address the complex correlations among multiple labels, an MLTC method named TLA-BERT was proposed that incorporates Bidirectional Encoder Representations from Transformers (BERT) and label semantic attention. Firstly, the contextual vector representation of the input text was learned by fine-tuning the autoencoding pre-trained model. Secondly, each label was encoded individually with a Long Short-Term Memory (LSTM) neural network. Finally, an attention mechanism was used to explicitly highlight the contribution of the text to each label in order to predict the multi-label sequence. Experimental results show that, compared with the Sequence Generation Model (SGM) algorithm, the proposed method improves the F value by 2.8 percentage points on the Arxiv Academic Paper Dataset (AAPD) and by 1.5 percentage points on the Reuters Corpus Volume I (RCV1)-v2 public dataset.
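
The label attention step described above can be illustrated with a minimal sketch. It assumes plain dot-product scoring followed by a softmax; the abstract does not specify the scoring function, and the names `label_attention`, `token_vecs`, and `label_vecs` are hypothetical. In TLA-BERT the token vectors would come from BERT and the label vectors from an LSTM; here they are small hand-written lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_vecs, label_vecs):
    """For each label, attend over the token vectors.

    Returns one (weights, context) pair per label, where `weights` sums
    to 1 over the tokens and `context` is the weighted sum of the token
    vectors, i.e. a label-specific text representation.
    Dot-product scoring is an assumption, not the paper's stated choice.
    """
    dim = len(token_vecs[0])
    results = []
    for label in label_vecs:
        scores = [sum(t * l for t, l in zip(tok, label)) for tok in token_vecs]
        weights = softmax(scores)
        context = [
            sum(w * tok[d] for w, tok in zip(weights, token_vecs))
            for d in range(dim)
        ]
        results.append((weights, context))
    return results


# Toy inputs: three 2-dimensional token vectors and two label vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
labels = [[2.0, 0.0], [0.0, 2.0]]
for weights, context in label_attention(tokens, labels):
    print([round(w, 3) for w in weights], [round(c, 3) for c in context])
```

Each label ends up with its own view of the text, which a per-label classifier could then score.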

Classification of steel surface defects based on lightweight network

SHI Yangxiao, ZHANG Jun, CHEN Peng, WANG Bing

Journal of Computer Applications 2021, 41 (6): 1836-1841. DOI: 10.11772/j.issn.1001-9081.2020081244

Defect classification is an important part of steel surface defect detection. Although Convolutional Neural Networks (CNNs) have achieved good results, their growing number of parameters incurs a high computational cost, which makes it challenging to deploy defect classification on personal computers or devices with low computing power. To address this problem, a novel lightweight network model named Mix-Fusion was proposed. Firstly, group convolution and channel shuffle were used to reduce the computational cost while maintaining accuracy. Secondly, a narrow feature mapping was used to fuse and encode the information between groups, and the generated features were combined with the original network, which effectively solved the problem that "sparse connection" convolution hinders information exchange between groups. Finally, a new type of Mixed depthwise Convolution (MixConv) replaced the traditional DepthWise Convolution (DWConv) to further improve model performance. Experimental results on the NEU-CLS dataset show that the Mix-Fusion network requires 43.4 million floating-point operations (MFLOPs) and reaches a classification accuracy of 98.61% on the defect classification task. Compared with ShuffleNetV2 and MobileNetV2, the proposed Mix-Fusion network effectively reduces the number of model parameters and the model size while achieving better classification accuracy.
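
As an illustration of the channel shuffle operation mentioned above, the following sketch permutes a flat list of channel indices in the way popularized by ShuffleNet: view the channels as a `groups x channels_per_group` grid, transpose it, and flatten. The function name `channel_shuffle` is hypothetical, and the sketch works on channel indices only rather than on real feature maps; it is not taken from the Mix-Fusion implementation.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    `channels` is a flat list ordered group by group. The result takes
    the first channel of every group, then the second of every group,
    and so on, so the next group convolution sees inputs from all groups.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Equivalent to reshape(groups, per_group) -> transpose -> flatten.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


print(channel_shuffle(list(range(6)), 2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each contiguous block of channels contains members of every original group, which is what restores information flow between groups.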

Target-dependent method for authorship attribution

Yang LI, Wei ZHANG, Chen PENG

Journal of Computer Applications 2020, 40 (2): 473-478. DOI: 10.11772/j.issn.1001-9081.2019101768

Authorship attribution is the task of deciding who wrote a particular document. However, traditional authorship attribution methods are target-independent: they do not consider any constraint when predicting authorship, which is inconsistent with real-world problems. To address this issue, a Target-Dependent method for Authorship Attribution (TDAA) was proposed. Firstly, the product ID corresponding to the user review was chosen as the constraint information. Secondly, Bidirectional Encoder Representations from Transformers (BERT) was used to extract pre-trained features of the review text, making the text modeling process more general. Thirdly, a Convolutional Neural Network (CNN) was used to extract deep features of the text. Finally, two fusion methods were proposed to fuse the two kinds of information. Experimental results on the Amazon Movie_and_TV and CDs_and_Vinyl_5 datasets show that the proposed method increases accuracy by 4%-5% compared with the comparison methods.

Population model of giant panda ecosystem based on population dynamics P system

TIAN Hao, ZHANG Gexiang, RONG Haina, Mario J. PÉREZ-JIMÉNEZ, Luis VALENCIA-CABRERA, CHEN Peng, HOU Rong, QI Dunwu

Journal of Computer Applications 2018, 38 (5): 1488-1493. DOI: 10.11772/j.issn.1001-9081.2017102551

Giant panda pedigree data is an important basis for studying the population dynamics of giant pandas, so modeling giant panda ecosystems from such data is of great significance for panda conservation. To this end, a data modeling method for the giant panda ecosystem based on the population dynamics P system was proposed. Based on the giant panda pedigree data released by the Chinese Association of Zoological Gardens, the population characteristics of captive pandas at the China Giant Panda Conservation Research Center were simulated and studied at the level of individual behavior. The change rules of reproductive parameters were analyzed in detail and added to the release-to-the-wild module. Finally, a population dynamics P system for giant pandas with release to the wild was designed, comprising a two-layer nested membrane structure, a collection of objects, and a series of evolution rules that are in line with the characteristics of giant pandas. For all giant pandas, the maximum relative error between the simulation results and the actual data was within ±4.13%, and the error was mostly kept within ±2.7%. The experimental results verify the effectiveness and soundness of the proposed model, which can simulate the population dynamics trend of giant pandas and provide a basis for management decision-making.

Reduction method of test suites based on mutation analysis

WANG Shuyan, CHEN Pengyuan, SUN Jiaze

Journal of Computer Applications 2017, 37 (12): 3592-3596. DOI: 10.11772/j.issn.1001-9081.2017.12.3592

In regression testing, changing test requirements cause the scale of test suites to keep expanding and the cost of testing to increase. To solve these problems, a Reduction method of Test suites based on the analysis of Mutation (RTM) was proposed. Firstly, the test suites were classified, and the transaction set matrix of mutants was created in binary form according to whether each designated mutant could be detected by each test suite. Then, the correlations between test suites were obtained by using an improved association mining algorithm. Finally, the test suites were reduced according to these correlations. Simulation results on six classical programs show that the test suite reduction rate of RTM can reach 37%. Compared with the traditional greedy algorithm and heuristic algorithm, RTM improves the test suite reduction rate by 6% while guaranteeing the test coverage rate; the test coverage rate of a single test suite even increases by 11% on average. The proposed method can meet more test requirements with fewer test suites, effectively improving test efficiency and reducing test cost.
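
To make the binary mutant matrix concrete, here is a minimal sketch of a greedy reduction over such a matrix. This is one common greedy heuristic of the kind RTM is compared against, not RTM itself (the association mining step is not reproduced), and not necessarily the exact baseline the authors used. The name `greedy_reduce` and the toy matrix are hypothetical.

```python
def greedy_reduce(kill_matrix):
    """Greedy test reduction over a binary kill matrix.

    kill_matrix[t][m] is 1 if test t detects (kills) mutant m, else 0.
    Repeatedly picks the test that kills the most not-yet-covered
    mutants until every killable mutant is covered.
    Returns the indices of the selected tests, in selection order.
    """
    kills = [{m for m, bit in enumerate(row) if bit} for row in kill_matrix]
    remaining = set().union(*kills)  # mutants killed by at least one test
    selected = []
    while remaining:
        # Ties are broken in favour of the lowest test index.
        best = max(range(len(kills)), key=lambda t: len(kills[t] & remaining))
        selected.append(best)
        remaining -= kills[best]
    return selected


# Rows are tests, columns are mutants (toy data).
matrix = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
chosen = greedy_reduce(matrix)
reduction_rate = 1 - len(chosen) / len(matrix)
print(chosen, reduction_rate)  # [0, 2] 0.5
```

Here two of the four tests kill every mutant, so the reduction rate is 50% with no loss of mutant coverage.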

Simultaneous iterative hard thresholding for joint sparse recovery based on redundant dictionaries

CHEN Peng, MENG Chen, WANG Cheng, CHEN Hua

Journal of Computer Applications 2015, 35 (9): 2508-2512. DOI: 10.11772/j.issn.1001-9081.2015.09.2508

To improve the recovery performance of signals sampled by a sub-Nyquist sampling system with Compressed Sensing (CS), a block Simultaneous Iterative Hard Thresholding (SIHT) recovery algorithm for the joint sparse model based on ε-closure was proposed. Firstly, the CS synthesis model for the Multiple Measurement Vectors (MMV) of the sampling system was analyzed, and the concepts of ε-coherence and the Restricted Isometry Property (RIP) were introduced. Then, according to the block coherence of redundant dictionaries, the SIHT algorithm was improved by optimizing the support sets during iterations. In addition, the iterative convergence constant was given and the convergence property of the algorithm was analyzed. Finally, simulation experiments show that, compared with the traditional method, the new algorithm achieves a recovery success rate of 100% given enough sampling channels, improves noise suppression by 7 dB to 9 dB, reduces total execution time by at least 37.9%, and converges faster.
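
The thresholding step at the core of standard SIHT can be sketched as follows: for a jointly sparse matrix whose columns share a common support, keep the k rows with the largest Euclidean norm and zero the rest. This shows only the textbook row-wise operator, not the block, ε-closure-based variant proposed in the paper, and the name `row_hard_threshold` is hypothetical.

```python
import math


def row_hard_threshold(X, k):
    """Keep the k rows of X with the largest l2 norm; zero all others.

    X is a list of rows (one row per dictionary atom, one column per
    measurement vector). Selecting whole rows enforces a support that is
    shared by all columns, which is the joint sparsity assumption.
    """
    norms = [math.sqrt(sum(v * v for v in row)) for row in X]
    order = sorted(range(len(X)), key=lambda i: norms[i], reverse=True)
    keep = set(order[:k])
    return [
        list(row) if i in keep else [0.0] * len(row)
        for i, row in enumerate(X)
    ]


X = [[3.0, 4.0], [0.1, 0.2], [1.0, 0.0], [0.0, -6.0]]
print(row_hard_threshold(X, 2))
# [[3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, -6.0]]
```

In a full SIHT iteration this operator would be applied after each gradient step on the residual; that loop is omitted here.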

Study of human motion tracking system based on wireless sensor network

CHEN Pengzhan, LI Jie, LUO Man

Journal of Computer Applications 2015, 35 (8): 2316-2320. DOI: 10.11772/j.issn.1001-9081.2015.08.2316

To solve the problems of attitude drift, poor real-time performance, and high cost in motion capture systems based on inertial sensors, a real-time motion capture system with low cost and low power consumption was designed that effectively overcomes attitude drift. Firstly, distributed joint motion capture nodes were built based on the principles of human body kinematics. Every node worked in low-power mode: when the data acquired by a node fell below a predetermined threshold, the node automatically entered sleep mode to reduce the power consumption of the system. To reduce the data drift of the traditional algorithm, an algorithm combining inertial navigation with the Kalman filter was designed to calculate the motion data in real time. The attitude data were transmitted through a Wi-Fi module using the TCP/IP protocol and drove the model in real time. Finally, the accuracy of the algorithm was evaluated on a multi-axis motor test platform, and the system's performance in tracking real human motion was compared. The experimental results show that the algorithm is more accurate than the traditional complementary filtering algorithm and keeps the angle drift below one degree; it also shows no obvious lag relative to the complementary filter, so accurate tracking of human motion can be achieved.
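
The idea of correcting integrated inertial data with a Kalman filter can be shown with a one-dimensional sketch. It assumes a single angle, a gyroscope rate used for prediction, and an accelerometer-derived angle used as the measurement. The function name `kalman_angle` and the noise parameters `q` and `r` are hypothetical, and the paper's actual state model is not given in the abstract.

```python
def kalman_angle(gyro_rates, accel_angles, dt, q=0.01, r=1.0, angle0=0.0):
    """Scalar Kalman filter for one joint angle.

    gyro_rates   : angular rates (deg/s), integrated in the predict step
    accel_angles : angle measurements (deg), used in the update step
    dt           : sample period in seconds
    q, r         : assumed process and measurement noise variances
    Returns the list of filtered angle estimates.
    """
    angle, p = angle0, 1.0  # state estimate and its variance
    estimates = []
    for rate, z in zip(gyro_rates, accel_angles):
        # Predict: integrate the gyro rate; uncertainty grows by q.
        angle += rate * dt
        p += q
        # Update: blend in the measurement according to the Kalman gain.
        k = p / (p + r)
        angle += k * (z - angle)
        p *= 1.0 - k
        estimates.append(angle)
    return estimates


# A stationary joint held at 10 degrees: zero gyro rate, constant measurement.
est = kalman_angle([0.0] * 50, [10.0] * 50, dt=0.01)
print(round(est[0], 3), round(est[-1], 3))
```

Because the measurement keeps pulling the estimate back, integration error from the gyroscope cannot accumulate without bound, which is the drift-limiting effect the abstract describes.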

Key technologies of dynamic information database for power systems

HUANG Haifeng, ZHANG Keheng, ZHANG Hong, JI Xuechun, CHEN Peng

Journal of Computer Applications 2011, 31 (6): 1681-1684. DOI: 10.3724/SP.J.1087.2011.01681

Based on an analysis of the structure of the dynamic information database and the characteristics of power systems, the key technologies of concurrent data processing, memory-mapped files, the disk cache management mechanism, and associated data storage were discussed, and the data sampling flow and the hybrid compression algorithm were introduced in detail. An application case in a power grid dispatching automation system was presented, and the results show that the dynamic information database can meet the performance requirements of high-speed data processing.